Claims

1. An information retrieval technique evaluation method characterized in that it is an 
evaluation method for the retrieval efficiency of an information retrieval technique used on an 
information retrieval system that acquires retrieved data that match data to be retrieved given in 
advance from a database, that calculates the retrieval efficiency of respective retrieval techniques
from an external evaluation value that indicates whether the data in the database match the data to
be retrieved determined by at least one or more persons, and a value as to whether or not the data in
the database match the data to be retrieved determined by the information retrieval technique, and
that determines the merits of the retrieval technique, 

the number of external values where it is determined by at least one or more persons that 
the data to be retrieved and any one datum in the database match, and the number of external 
values where it is determined that they do not match are each counted, 

if the judgment as to whether or not they agree with the data to be retrieved that is 
determined by the information retrieval technique for the one datum is the same as the judgments 
on the majority side as to whether or not they agree with the retrieval conditions produced by a 
given external evaluation value, a constant value given in advance is used as the data evaluation 
value, and if the same as the minority side, a value obtained with a formula given in advance from 
the number of external evaluation values on the majority and minority sides is used as a one datum 
evaluation value, 

the aforementioned one datum evaluation value is created for the combination of all the 
given data to be retrieved and the data in the database and the average value of all the one datum 
evaluation values is calculated, 

retrieval efficiency is expressed using one index by using said average value as the 
retrieval efficiency evaluation value for the information retrieval technique, 

and the merits of each information retrieval technique are determined by sequencing the 
evaluation value calculated for at least one or more information retrieval techniques. 

2. The information retrieval technique evaluation method mentioned in Claim 1 
characterized in that when the aforementioned one datum evaluation value is calculated, the actual
evaluation value is calculated from the number of people on the minority side and the number of
people on the majority side in accordance with a predetermined formula,

a predetermined number of people is subtracted from the number of people on the majority
side, the value subtracted from the majority side is added to the number of people on the minority
side, and an imaginary evaluation value is calculated in accordance with a predetermined formula, 

the difference between the imaginary evaluation value and the actual evaluation value is 
used as the degree of wobble of the one datum evaluation value and the average value of the degree 
of wobble for the combination of all retrieval conditions and the data in the database is calculated, 

the retrieval efficiency with a confidence interval that increases or decreases the value of 
the extent of wobble is calculated for the aforementioned retrieval efficiency value, 

and the merits of the retrieval efficiency of multiple information retrieval techniques are 
judged by calculating the maximum value and the minimum value for the retrieval efficiency with 
said confidence interval and one retrieval efficiency value for each retrieval technique with a 
formula given in advance from the aforementioned data evaluation values. 
3. The information retrieval technique evaluation method mentioned in Claim 1 or 2
characterized in that the corresponding number of people that agrees with the data to be retrieved 
and the corresponding number of people that do not agree are compared by the external evaluation 
value in the calculation of the aforementioned one datum evaluation value, and when they are the 
same number, a predetermined value is used as the one datum evaluation value to calculate the 
retrieval efficiency of each retrieval technique. 

4. An information retrieval technique device characterized in that it is an information 
retrieval technique evaluation device which is a retrieval efficiency evaluation device for an 
information retrieval technique used on an information retrieval system that acquires retrieved 
data that matches data to be retrieved given in advance, that calculates the retrieval efficiency of 
respective retrieval techniques from an external evaluation value that indicates whether the data in
the database match the data to be retrieved determined by at least one or more persons, and a value 
as to whether or not the data in the database match the data to be retrieved determined by the 
information retrieval technique, and that determines the merits of the retrieval technique, 

and that is furnished with: an input means for inputting the result of the judgment as to 
whether or not something should be retrieved from at least one or more persons,

a counting means that counts the number of people that think something should be
retrieved and the number of people that think something should not be retrieved from said means
for input of human judgment result between individual retrieval results and individual data to be
retrieved, 

a determination means that compares the number of people that think something should be 
retrieved counted by said counting means and the number of people that think it should not be 
retrieved, and that determines that said data to be retrieved should be retrieved when the number of 
people that think it should be retrieved is at or above a predetermined ratio, 

a judgment means that judges whether the judgment that the information retrieval 
technique should retrieve for the aforementioned retrieval conditions and the data to be retrieved 
agrees or does not agree, 

a constant value generation means that generates a predetermined constant when the result 
of said judgment is that there is agreement, 

a penalty value generation means that generates a penalty value according to a 
predetermined formula when the result of judgment by the aforementioned judgment means is that 
there is not agreement, 

an average value calculation means that finds the average value of the constant value and 
the penalty value obtained for the combination of all given retrieval conditions and data to be 
retrieved as an average value,

and a sorting means that places the average values obtained for each retrieval technique in
a prescribed order. 

5. The information retrieval technique evaluation device mentioned in Claim 4 
characterized in that it is furnished with: a calculation means that determines the magnitude in the 
difference of human judgment according to a predetermined formula after the number of people 
that think that the results of human judgment should be retrieved and the number of people that 
think they should not be retrieved are counted, 

a comparison means that compares the number of people that think something should be 
retrieved and the number of people that think it should not be retrieved in order to provide noise to 
the human judgment, 

an addition and subtraction means that subtracts a predetermined constant value from the 
value of the number of people on the majority side and adds a predetermined constant to the 
number of people on the minority side as a result of comparing the number of people by said 
comparison means, 

a recalculation means that again calculates the magnitude of the difference in judgment 
between humans according to a predetermined formula from the number of people that think it 
should be retrieved that is reset by said addition and subtraction means and the number of people 
that think it should not be retrieved, 

a second average value calculation means that finds the average value of the result of 
calculating the magnitude of the difference between the human judgment produced by the 
aforementioned recalculation means and the judgment containing noise for the combination of all 
retrieval conditions and retrieval data, 

a minimum and maximum value calculation means that calculates the minimum value and 
maximum value of the average value with a predetermined formula from the average value
obtained by the aforementioned average value calculation means and the average value obtained 
by the aforementioned second average value calculation means, 

an order calculation means that calculates the respective order of each retrieval technique 
by arranging the average value, minimum value, and maximum value in order from the smallest, 

and an order determination means that uses the order value of each retrieval technique for 
each of the three values of average value, minimum value and maximum value as one value with a
predetermined formula. 

6. The information retrieval technique evaluation device mentioned in Claim 4 or 5 
characterized in that it is furnished with: a comparison means that compares the number of people
that think it should be retrieved and the number of people that think it should not be retrieved
between one datum to be retrieved and one [more] datum to be retrieved,
and a constant generation means that generates a predetermined value if the result of
comparison by said comparison means is the same number. 

Detailed explanation of the invention 
[0001] 

Industrial application field 

This invention relates to an information retrieval technique evaluation method and device 
for same for automatically selecting the optimal information retrieval technique when multiple 
information retrieval techniques are available. In particular, it relates to an evaluation method and 
device for a database retrieval technique where the relationship between retrieval conditions and 
data that should be retrieved changes often according to [the needs of the] user, as represented by a 
current events information database, such as for newspaper articles. 



[0002]

Prior art

In the past, in evaluating information retrieval techniques, two indices, called the recall
ratio represented by Formula (1) below and the relevance factor represented by Formula (2), were
judged in combination. (There are many examples, such as: Gerard Salton (Ed.), The SMART
Retrieval System - Experiments in Automatic Document Processing, Englewood Cliffs, NJ,
Prentice-Hall, 1971; B. Masand, G. Linoff, D. Waltz, Classifying News Stories using Memory
Based Reasoning, Proceedings, 15th Int'l SIGIR 1992, pp. 59-65, 1992.)



Formula (1)

$$\text{Recall ratio} = \frac{|H \cap T|}{|H|}$$

Key:
H: Data that people think matches the retrieval conditions
T: Data judged to match the retrieval conditions by the retrieval technique
4 
Formula (2)

$$\text{Relevance factor} = \frac{|H \cap T|}{|T|}$$

Key:
H: Data that people think matches the retrieval conditions
T: Data judged to match the retrieval conditions by the retrieval technique



[0003] 

Problems to be solved by the invention 

However, the recall ratio and the relevance factor share the same numerator, so only the
difference in denominator affects the indices. The denominator of the recall ratio is the number of
data that should be retrieved based on human judgment, and the denominator of the relevance
factor is the number of data that should be retrieved in the judgment of the retrieval technique.
When the retrieval technique conditions are relaxed to increase the recall ratio without changing
human judgment, many results will be retrieved, so the relevance factor drops. Conversely,
increasing the relevance factor requires stricter retrieval technique conditions, so the recall ratio
drops; the two indices are in a tradeoff relationship. For this reason, when the merits and
disadvantages of various retrieval techniques were evaluated, it could not be judged which
technique was suitable, because it was unclear whether a technique with a higher relevance factor
or a technique with a higher recall ratio should be preferred.
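The tradeoff described above can be illustrated with a small sketch. Only the two ratios come from Formulas (1) and (2); the scores, the thresholds, and the names `recall_ratio`, `relevance_factor`, and `retrieve` are hypothetical and chosen purely for illustration.

```python
def recall_ratio(human: set[str], technique: set[str]) -> float:
    """Formula (1): share of the human-approved data that the technique retrieved."""
    return len(human & technique) / len(human)


def relevance_factor(human: set[str], technique: set[str]) -> float:
    """Formula (2): share of the retrieved data that humans approve of."""
    return len(human & technique) / len(technique)


# Hypothetical similarity scores assigned by some retrieval technique.
SCORES = {"d1": 0.9, "d2": 0.8, "d3": 0.6, "d4": 0.5, "d5": 0.3, "d6": 0.2}
# Hypothetical set of data that people think should be retrieved.
HUMAN = {"d1", "d2", "d4"}


def retrieve(threshold: float) -> set[str]:
    """Return every datum whose score is at or above the threshold."""
    return {doc for doc, score in SCORES.items() if score >= threshold}


strict = retrieve(0.7)    # strict conditions: few results
relaxed = retrieve(0.25)  # relaxed conditions: many results

for name, result in (("strict", strict), ("relaxed", relaxed)):
    print(name, recall_ratio(HUMAN, result), relevance_factor(HUMAN, result))
```

With these assumed numbers, relaxing the threshold raises the recall ratio and lowers the relevance factor, which is the pattern the paragraph describes.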



[0004]

In particular, the relevance factor uses a "value produced by the retrieval technique" for the
denominator. So, for example, the same value of 30% results both in a case where 9 data are
correct in the judgment of humans, 30 data are obtained as the retrieved result, and all 9 that
humans think should be retrieved are included, and in a case where only 10 data are retrieved and
3 correct data are included. It is not possible to judge from the relevance factor which case
represents the better retrieval technique.
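The arithmetic of the 30% example can be checked with a short sketch. The counts for the two cases are taken from the text; the assumption that humans judge 9 data correct in the second case as well, and the function names, are illustrative only.

```python
from fractions import Fraction


def relevance_from_counts(correct_retrieved: int, retrieved: int) -> Fraction:
    """Relevance factor computed from counts, kept exact with Fraction."""
    return Fraction(correct_retrieved, retrieved)


def recall_from_counts(correct_retrieved: int, human_correct: int) -> Fraction:
    """Recall ratio computed from counts, kept exact with Fraction."""
    return Fraction(correct_retrieved, human_correct)


# Assumption: humans judge 9 data to be correct in both cases.
HUMAN_CORRECT = 9
case_a = {"retrieved": 30, "correct_retrieved": 9}
case_b = {"retrieved": 10, "correct_retrieved": 3}

for label, case in (("A", case_a), ("B", case_b)):
    relevance = relevance_from_counts(case["correct_retrieved"], case["retrieved"])
    recall = recall_from_counts(case["correct_retrieved"], HUMAN_CORRECT)
    print(label, relevance, recall)
```

Both cases give the same relevance factor, while under the stated assumption their recall ratios differ, so the relevance factor alone cannot separate them.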



[0005] 

The recall ratio has a problem, too: even when the same data are searched, as with current
events information, the content of the retrieval results wanted differs from one person retrieving
to another, so the denominator of the recall ratio changes and the value of the recall ratio changes
with it.

[0006] 

When the retrieval efficiency evaluation method is represented as in Figure 2, this problem
has two causes: attention is paid only to data that a person thinks should be retrieved and that the
information retrieval technique also retrieves, and the evaluation technique takes into account
neither the wobble in human judgment nor the fact that data that should not be retrieved are not
retrieved.

[0007] 

The purpose of this invention, in consideration of the aforementioned problems, is to 
provide an information retrieval technique evaluation method and device for same that can 
accurately judge the optimal information retrieval technique. 

[0008] 

Means to solve the problems 

With this invention, with Claim 1 for achieving the aforementioned purpose, an
information retrieval technique evaluation method is proposed that is an evaluation method for the
retrieval efficiency of an information retrieval technique used on an information retrieval system
that acquires retrieved data that match data to be retrieved given in advance from a database, that
calculates the retrieval efficiency of respective retrieval techniques from an external evaluation
value that indicates whether the data in the database match the data to be retrieved determined by
at least one or more persons, and a value as to whether or not the data in the database match the
data to be retrieved determined by the information retrieval technique, and that determines the
merits of the retrieval technique. The number of external values where it is determined by at least
one or more persons that the data to be retrieved and any one datum in the database match, and the
number of external values where it is determined that they do not match, are each counted. If the
judgment, determined by the information retrieval technique for the one datum, as to whether or
not the data to be retrieved is matched is the same as the judgment on the majority side of the
given external evaluation values as to whether or not the retrieval conditions are matched, a
constant value given in advance is used as the one datum evaluation value. If it is the same as the
minority side, a value obtained with a formula given in advance from the number of external
evaluation values on the majority and minority sides is used as the one datum evaluation value. The
aforementioned one datum evaluation value is created for the combination of all the given data to
be retrieved and the data in the database, and the average value of all the one datum evaluation
values is calculated. Retrieval efficiency is expressed using one index by using said average value
as the retrieval efficiency evaluation value for the information retrieval technique, and the merits
of each information retrieval technique are determined by sequencing the evaluation value
calculated for at least one or more information retrieval techniques.
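The steps of Claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the method as specified: the text does not give the constant or the formula, so `AGREE_VALUE`, the minority-share formula in `minority_value`, the "higher is better" ordering, and all names and data are hypothetical. Ties between the two sides are left out here.

```python
from statistics import mean

Pair = tuple[str, str]  # (retrieval condition, datum in the database)

AGREE_VALUE = 1.0  # assumption: stands in for the "constant value given in advance"


def minority_value(majority: int, minority: int) -> float:
    """Assumed stand-in for the 'formula given in advance': share of minority votes."""
    return minority / (majority + minority)


def one_datum_value(votes: list[bool], technique_says: bool) -> float:
    """Evaluation value for one (condition, datum) pair."""
    yes = sum(votes)        # people who think the datum matches
    no = len(votes) - yes   # people who think it does not match
    if yes == no:
        raise ValueError("tie between the two sides is not covered by this sketch")
    majority_says = yes > no
    if technique_says == majority_says:
        return AGREE_VALUE
    return minority_value(max(yes, no), min(yes, no))


def retrieval_efficiency(votes: dict[Pair, list[bool]], judged: dict[Pair, bool]) -> float:
    """Average of the one datum evaluation values over all pairs."""
    return mean(one_datum_value(votes[pair], judged[pair]) for pair in votes)


def rank_techniques(votes, judged_by_technique):
    """Order techniques by their single index (assumption: higher is better)."""
    scores = {name: retrieval_efficiency(votes, judged)
              for name, judged in judged_by_technique.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical judgments by five people for two pairs.
votes = {
    ("q1", "d1"): [True, True, True, True, False],
    ("q1", "d2"): [False, False, False, True, True],
}
judged_by_technique = {
    "A": {("q1", "d1"): True, ("q1", "d2"): True},
    "B": {("q1", "d1"): False, ("q1", "d2"): False},
}
ranking = rank_techniques(votes, judged_by_technique)
print(ranking)
```

The point of the design is that every pair contributes one value, including pairs that are correctly not retrieved, so a single averaged index is enough to order the techniques.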

[0009] 

And with Claim 2, in the information retrieval technique evaluation method mentioned in 
Claim 1, an information retrieval technique evaluation method is proposed such that when the 
aforementioned data evaluation value is calculated, the actual evaluation value is calculated from
the number of people on the minority side and the number of people on the majority side of the 
external evaluation value according to a predetermined formula, a predetermined number of 
people is subtracted from the number of people on the majority side, the value subtracted from the 
majority side is added to the number of people on the minority side, and an imaginary evaluation 
value is calculated according to a predetermined formula. The difference between the imaginary 
evaluation value and the actual evaluation value is used as the extent of wobble of the one datum 
evaluation value. The average value of the extent of wobble for the combination of all the retrieval 
conditions and the data in the database is calculated and retrieval efficiency with a confidence 
interval where the extent of wobble in the value of the aforementioned retrieval efficiency value is
increased and decreased is calculated. The merits of retrieval efficiency of multiple information 
retrieval techniques are judged by calculating one retrieval efficiency value for each retrieval 
technique with a predetermined formula from the maximum value and minimum value of said 
retrieval efficiency with confidence interval and the aforementioned data evaluation value. 
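The extent of wobble and the resulting confidence interval can be sketched as below. The formula in `disagreement_value`, the default shift of one person, the use of an absolute difference, and all names are assumptions made for illustration; the text only says that predetermined formulas are used. The final step of Claim 2, which combines the maximum, the minimum, and the evaluation value into one value, is not sketched because its formula is not given.

```python
from statistics import mean


def disagreement_value(majority: int, minority: int) -> float:
    """Assumed stand-in for the predetermined formula: share of minority votes."""
    return minority / (majority + minority)


def wobble(majority: int, minority: int, shift: int = 1) -> float:
    """Difference between the imaginary and the actual evaluation value.

    The imaginary value moves `shift` people from the majority side to the
    minority side before applying the same formula.
    """
    actual = disagreement_value(majority, minority)
    imaginary = disagreement_value(majority - shift, minority + shift)
    return abs(imaginary - actual)


def efficiency_interval(efficiency: float,
                        counts: list[tuple[int, int]],
                        shift: int = 1) -> tuple[float, float]:
    """Return (minimum, maximum) of the efficiency widened by the average wobble."""
    average_wobble = mean(wobble(maj, mino, shift) for maj, mino in counts)
    return efficiency - average_wobble, efficiency + average_wobble


# Hypothetical (majority, minority) head counts for two pairs, five judges each.
counts = [(4, 1), (3, 2)]
low, high = efficiency_interval(0.7, counts)
print(low, high)
```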

[0010] 

And with Claim 3, in the information retrieval technique evaluation method mentioned in 
Claim 1 or 2, an information retrieval technique evaluation method is proposed where, in the 
calculation of the aforementioned data evaluation value, the number of corresponding people that 
agree with the retrieval data and the corresponding number of people that do not agree are 
compared by the external evaluation value, and when they are the same, the retrieval efficiency of 
each retrieval technique is calculated with a predetermined value as a one datum evaluation value. 
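The tie rule of Claim 3 can be layered on top of any per-datum evaluator. In this sketch the wrapper `with_tie_rule`, the base evaluator `majority_only`, and the tie value of 0.5 are all hypothetical; the text only states that a predetermined value is used when both sides have the same number of people.

```python
from typing import Callable, Sequence

Evaluator = Callable[[Sequence[bool], bool], float]


def with_tie_rule(evaluate: Evaluator, tie_value: float) -> Evaluator:
    """Wrap an evaluator so that an even split returns a predetermined value."""
    def wrapped(votes: Sequence[bool], technique_says: bool) -> float:
        yes = sum(votes)
        no = len(votes) - yes
        if yes == no:
            return tie_value  # same number on both sides: use the fixed value
        return evaluate(votes, technique_says)
    return wrapped


def majority_only(votes: Sequence[bool], technique_says: bool) -> float:
    """Hypothetical base evaluator: 1.0 if the technique sides with the majority."""
    majority_says = sum(votes) * 2 > len(votes)
    return 1.0 if technique_says == majority_says else 0.0


evaluate = with_tie_rule(majority_only, tie_value=0.5)
print(evaluate([True, False], True), evaluate([True, True, False], True))
```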

[0011] 

Also, with Claim 4, an information retrieval technique evaluation device is proposed using 
an information retrieval technique used on an information retrieval system that acquires retrieved 
data that match data to be retrieved given in advance, and is an information retrieval technique 
evaluation device that calculates the retrieval efficiency of respective retrieval techniques from an 
external evaluation value that indicates whether the data in the database match the data to be
retrieved determined by at least one or more persons, and a value as to whether or not the data in 
the database match the data to be retrieved determined by the information retrieval technique, and 
that determines the merits of the retrieval technique. There are furnished: an input means for 
inputting the result of the judgment as to whether or not something should be retrieved from at 
least one or more persons, a counting means that counts the number of people that think it should 
be retrieved and the number of people that think it should not be retrieved from said means for 
input of human judgment result between individual retrieval results and individual data to be 
retrieved, a determination means that compares the number of people that think something should 
be retrieved counted by said counting means and the number of people that think it should not be 
retrieved, and that determines that said data to be retrieved should be retrieved when the number of 
people that think it should be retrieved is at or above a predetermined ratio, a judgment means that 
judges whether the judgment that the information retrieval technique should retrieve for the 
aforementioned retrieval conditions and the data to be retrieved agrees or does not agree, a 
constant value generation means that generates a predetermined constant when the result of said 
judgment is that there is agreement, a penalty value generation means that generates a penalty 
value according to a predetermined formula when the result of judgment by the aforementioned 
judgment means is that there is not agreement, an average value calculation means that finds the 
average value of the constant value and the penalty value obtained for the combination of all given 
retrieval conditions and data to be retrieved as an average value, and a sorting means that places 
the average values obtained for each retrieval technique in a prescribed order. 

[0012] 

Also, with Claim 5, an information retrieval technique evaluation device is proposed such 
that, in the information retrieval technique evaluation device mentioned in Claim 4, there are 
furnished: a calculation means that calculates the magnitude in the difference of human judgment 
according to a predetermined formula after the number of people that think that the results of 
human judgment should be retrieved and the number of people that think they should not be 
retrieved are counted, a comparison means that compares the number of people that think it should 
be retrieved and the number of people that think it should not be retrieved in order to provide noise 
to the human judgment, an addition and subtraction means that subtracts a predetermined constant 
value from the value of the number of people on the majority side and adds a predetermined 
constant to the number of people on the minority side as a result of comparing the number of 
people by said comparison means, a recalculation means that again calculates the magnitude of the 
difference in judgment between humans according to a predetermined formula from the number of 
people that think it should be retrieved that is reset by said addition and subtraction means and the 
number of people that think it should not be retrieved, a second average value calculation means 
that finds the average value of the result of calculating the magnitude of the difference between 
the human judgment produced by the aforementioned recalculation means and the judgment 
containing noise for the combination of all retrieval conditions and retrieval data, a minimum and 
maximum value calculation means that calculates the minimum value and maximum value of the 
average value with a predetermined formula from the average value obtained by the 
aforementioned average value calculation means and the average value obtained by the 
aforementioned second average value calculation means, an order calculation means that 
calculates the respective order of each retrieval technique by arranging the average value, 
minimum value, and maximum value in order from the smallest, and an order determination means 
that combines the order values of each retrieval technique for the three values of average value, 
minimum value, and maximum value into one value with a predetermined formula. 

[0013] 

In addition, with Claim 6, an information retrieval technique evaluation device is proposed 
where, in the information retrieval technique evaluation device mentioned in Claim 4 or 5, a 
comparison means is furnished that compares the number of people that think it should be 
retrieved and the number of people that think it should not be retrieved, between one datum to be 
retrieved and one [more] datum to be retrieved, as well as a constant generation means that 
generates a predetermined value if the comparison results of said comparison means are the same 
number. 

[0014] 
Operation 

With Claim 1 of this invention, the number of external evaluation values where at least one 
person has determined that a datum in the database matches the data to be retrieved, and the 
number of external evaluation values where it is determined that they do not match, are each 
counted. If the judgment made by the information retrieval technique as to whether the one datum 
matches the data to be retrieved is the same as the majority side of the judgments given by the 
external evaluation values, a constant value given in advance is used as the one datum evaluation 
value. If it is the same as the minority side, a value obtained with a formula given in advance 
from the number of external evaluation values on the majority side and the minority side is used 
as the one datum evaluation value. The one datum evaluation value is created for every combination 
of the given data to be retrieved and the data in the database. In addition, the average value of 
all the one datum evaluation values is calculated, and by using this as the retrieval efficiency 
evaluation value for the information retrieval technique, the retrieval efficiency is represented 
using one index. By sequencing the evaluation values calculated for at least one information 
retrieval technique, the merits of each type of information retrieval technique are determined. 

[0015] 

Also, with Claim 2, when the aforementioned one datum evaluation value is calculated, an 
actual evaluation value is calculated from the value of the number of people on the minority side 
and the number of people on the majority side of the external evaluation value according to a 
predetermined formula. A predetermined number of people is subtracted from the number of 
people on the majority side, the value subtracted from the majority side is added to the number of 
people on the minority side, and an imaginary evaluation value according to a predetermined 
formula is calculated. The difference between said imaginary evaluation value and the actual 
evaluation value is used as the extent of wobble of the one datum evaluation value. The average 
value of the extent of wobble for the combination of all retrieval conditions and the data in the 
database is calculated, the retrieval efficiency with a confidence interval where the value of the 
extent of wobble for the aforementioned retrieval efficiency value is further increased or decreased 
is calculated, and one retrieval efficiency value is calculated for each retrieval technique with a 
predetermined formula from the maximum value and minimum value of said retrieval efficiency 
with confidence interval and from the aforementioned one datum evaluation value so that the 
merits of retrieval efficiency of multiple information retrieval techniques are judged. 
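
The wobble calculation described for Claim 2 can be sketched as follows. This is only an illustration: the text leaves the "predetermined formula" unspecified, so the sketch assumes the ratio form 1 - minority/majority that appears later in the application example, and the function names `evaluation_value`, `wobble`, and `efficiency_interval` are hypothetical.

```python
def evaluation_value(majority: int, minority: int) -> float:
    """Assumed 'predetermined formula': 1 - minority/majority."""
    return 1.0 - minority / majority


def wobble(majority: int, minority: int, shift: int = 1) -> float:
    """Extent of wobble when `shift` judges move from the majority to the minority."""
    actual = evaluation_value(majority, minority)
    moved_major, moved_minor = majority - shift, minority + shift
    # If the shift flips the sides, re-label them so the ratio stays in [0, 1].
    larger, smaller = max(moved_major, moved_minor), min(moved_major, moved_minor)
    imaginary = evaluation_value(larger, smaller)
    return abs(imaginary - actual)


def efficiency_interval(mean_value: float, mean_wobble: float) -> tuple:
    """Retrieval efficiency with a confidence interval: (minimum, maximum)."""
    return mean_value - mean_wobble, mean_value + mean_wobble
```

Under these assumptions, moving one judge in a panel split 6 to 1 changes the value by about 0.23, while the same move in a panel split 60 to 10 changes it by about 0.02, so a larger panel gives a narrower interval.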

[0016] 

Also, with Claim 3, in the calculation of the aforementioned data evaluation value, the 
number of people that agree with the data to be retrieved and the number of people that do not 
agree are compared using the external evaluation values. When they are the same number, a 
predetermined value is used as the one datum evaluation value and the retrieval efficiency of 
each retrieval technique is calculated. 

[0017] 

Also, with Claim 4, whether said data to be retrieved should or should not be retrieved is 
determined from the judgments, input by an input means, of at least one person as to whether or 
not something should be retrieved. For the human judgments as to whether or not something should 
be retrieved, the number of people on each side is counted by the counting means. A predetermined 
value for judging whether or not it should be retrieved and the measured ratio of the number of 
people that think it should or should not be retrieved are compared by a determination means, and 
it is determined whether said data to be retrieved are data that should be retrieved. Next, the 
input judgment result of the information retrieval technique as to whether or not something should 
be retrieved and the judgment result determined from human judgment are compared by the judgment 
means, and it is determined whether the judgment result obtained by the retrieval technique is in 
agreement with the judgment produced by humans. When the human judgment and the judgment by the 
retrieval technique agree, a predetermined constant value is given by a constant value generation 
means. When they differ, a penalty value is calculated with a predetermined formula from the ratio 
of the numbers of people on each side of the human judgment. For example, assuming that the 
constant value when the judgments agree is 0, the penalty value when they do not agree should be a 
value larger than 0. Because of this, when the judgment by the retrieval technique agrees 
completely with the human judgment as to whether or not something should be retrieved, 0 is always 
given for individual retrieval conditions and individual data to be retrieved. For data where the 
judgments do not agree, a value larger than 0 is obtained. So by calculating the average of these 
values with an average value calculation means, the evaluation value of a retrieval technique will 
be 0 when the judgments agree perfectly and will be larger for retrieval techniques that have more 
cases where the judgments do not agree, and the capability of each retrieval technique is 
expressed as one numerical value. In addition, by sequencing the calculated average values, for 
example, in order from the smallest, with a sorting means, the capabilities of the retrieval 
techniques can be compared. 

[0018] 

Also, with Claim 5, when a judgment as to whether or not to retrieve is performed using 
individual retrieval conditions and individual data to be retrieved, an actual evaluation value, that 
is, the magnitude of the difference in judgments between people, is calculated with a 
predetermined formula from the ratio of the number of people that think it should be retrieved and 
the number of people that think it should not be retrieved by a calculation means. The number of 
people that think it should be retrieved and the number of people that think it should not is 
compared by a comparison means. Based on said comparison results, a predetermined number of 
people is subtracted from the number of people on the majority side of the number of people who 
think it should be retrieved and the number of people who think it should not be retrieved, and a 
predetermined number of people is added to the minority side by an addition and subtraction 
means. The result is again calculated by a recalculation means in accordance with a predetermined 
formula from the ratio of the changed number of people, an imaginary evaluation value, that is, the 
magnitude of the difference in human judgments including noise is calculated, and the results of 
judgments by humans that are input and the results with noise included in the judgment are created. 
In addition, the magnitude of the differences of these two judgments is measured with a 
predetermined formula, and when the human judgment is slightly different, how much the penalty 
value will be affected is calculated. The effect of this value on the ratio of the number of people is 
large when the judgment of whether or not something should be retrieved is made by a sufficiently 
large number of people, even when an extremely small number of people move to the majority side 
or the minority side, and the effect of the value on the calculated penalty value is also large. Thus, 
by calculating the magnitude of the difference between the human judgment with noise added and 
the human judgment input, to what extent confidence can be placed in it can be numerically 
expressed. The minimum value and maximum value of the evaluation value are calculated with a 
minimum and maximum value calculation means from the retrieval efficiency evaluation value 
and the value that indicates the degree of confidence. When the degree of confidence is low, the 
retrieval efficiency can be represented by data for a broad range of evaluation values, and when the 
degree of confidence is high, by data for a narrow range of evaluation values. In addition, with an 
order calculation means that sequences the efficiency values, and with an order determination 
means that calculates the order of retrieval efficiency as one value from the order in each value 
with a predetermined formula, between minimum values, maximum values, and evaluation values, 
with an evaluation value with a low degree of confidence, for example, the possibility that the 
maximum value will be very large is high. So within the order of retrieval techniques for each of 
the values of maximum value, minimum value, and evaluation value, it is judged that a retrieval 
technique with a smaller maximum value indicates better retrieval efficiency. The result is that 
even when there is a difference in judgments by people, the merits and disadvantages of one or 
more retrieval techniques can be determined automatically. Because of this, when the results 
produced by an information retrieval technique and human judgment are different, a penalty value 
is calculated from the ratio of people who think something should be retrieved and the ratio of 
people who think it should not be retrieved, but if the value of this penalty is viewed statistically, 
the reliability changes according to the number of people making the judgment. So with the device 
mentioned in aforementioned Claim 4, a judgment as to whether or not something should be 
retrieved must necessarily be made by the same number of people, but with the device mentioned 
in Claim 5 there is no such restriction. 

[0019] 

Also, with Claim 6, when the majority and minority are determined, the number of people 
who think that something should be retrieved and the number of people who think it should not be 
retrieved between one datum to be retrieved and one [more] datum to be retrieved are compared by 
a comparison means. Based on the results of said comparison, when the numbers of people are 
the same, a predetermined constant is generated by a constant generation means, so that an 
evaluation value is calculated even when there is no majority or minority. Because of this, even 
when the number of people that think it should be retrieved and the number of people that think it 
should not be retrieved are the same, it is possible to judge between data that should be retrieved 
and data that should not be retrieved. 

[0020] 

Application example 

Below, an application example of this invention will be explained with reference to the 
figures. Figure 1 is a block diagram that shows an information retrieval technique evaluation 
device in a first application example of this invention. Said device is constituted from a computer, 
which is primarily a CPU, and from programs. In the figure, (1) is an external evaluation value and 
retrieval technique agreement determination module, (2) is an external evaluation value totaling 
module, (3) is a relational judgment module, (4) is a constant value generation module, (5) is a 
penalty calculation module, (6) is an average penalty calculation module, (7) is an evaluation value 
buffer, and (8) is a sorting module. Note that here, each module is constituted as a processing unit 
of a program (software), but they can also be constituted with hardware. 

[0021] 

External evaluation value and retrieval technique agreement determination module (1) 
compares the judgment as to whether or not something should be retrieved produced by the 
retrieval technique and the judgment given by a human. 

[0022] 

External evaluation value totaling module (2) receives input of the results of human 
judgment made by at least one or more persons and counts the number of people that think it 
should be retrieved and the number of people that think it should not be retrieved. 

[0023] 

Relational judgment module (3) determines whether or not the data should be retrieved 
from the number of people who think it should be retrieved and the number of people who think it 
should not be retrieved. 

[0024] 
Constant value generation module (4) generates a predetermined numerical value 
when the judgment produced by the retrieval technique and the human judgment agree as 
determined in external evaluation value and retrieval technique agreement determination 
module (1). 

[0025] 

Penalty calculation module (5) determines the penalty value for the fact that the retrieval 
technique made a wrong judgment, from the ratio of the numbers of people on each side of the 
human judgment as to whether or not something should be retrieved, when the judgment produced by 
the retrieval technique and the human judgment do not agree. 

[0026] 

Average penalty calculation module (6) finds the average of the penalty values calculated 
for the combination of all the retrieval conditions and the retrieval data for one retrieval technique. 

[0027] 

Evaluation value buffer (7) stores the average values calculated as evaluation values along 
with the retrieval technique. 

[0028] 

Sorting module (8) rearranges the retrieval techniques in evaluation value buffer (7) in 
order of the smaller evaluation values. 

[0029] 

With the application example that is constituted as described above, when the capability of 
a retrieval technique is measured, a table of the information retrieval technique retrieval results 
which is the database retrieval results produced with each type of retrieval technique and an 
external evaluation table that is the result of a judgment as to whether or not something should be 
retrieved made by a human for the same data are externally input in advance. 

[0030] 

Of these input data, the table of information retrieval technique retrieval results, as shown 
in Figure 3, is composed of a technique number for the retrieval technique, an identifier that 
represents the set of each retrieval condition and each datum in the database, and a judgment 
flag that represents whether or not said retrieval technique should retrieve the set represented by 
each identifier. 
[0031] 

An external evaluation value table, as shown in Figure 4, is composed of an identifier that 
represents the set of each retrieval condition and each datum in the database, and a judgment flag 
that indicates whether or not each person thinks each should be retrieved for the set of retrieval 
conditions and data represented by each identifier. 
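
The two input tables described above can be pictured as simple records. The following is a minimal sketch, not the layout of Figures 3 and 4 themselves; the class and field names (`TechniqueResult`, `ExternalEvaluation`, `pair_id`, `votes`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TechniqueResult:
    """One row of the information retrieval technique retrieval result table."""
    technique_no: int   # technique number for the retrieval technique
    pair_id: str        # identifier of one (retrieval condition, datum) set
    retrieve: bool      # judgment flag produced by the retrieval technique


@dataclass(frozen=True)
class ExternalEvaluation:
    """One row of the external evaluation value table."""
    pair_id: str              # identifier of one (retrieval condition, datum) set
    votes: Tuple[bool, ...]   # one judgment flag per person

    def count(self) -> Tuple[int, int]:
        """Return (number who think it should be retrieved, number who do not)."""
        yes = sum(self.votes)
        return yes, len(self.votes) - yes
```

For instance, a row with five True flags and two False flags yields the counts (5, 2).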

[0032] 

For the process in the first application example, as shown in Figure 5, first, as step 11, the 
number of people that think it should be retrieved and the number of people that think it should not 
be retrieved are each counted in external evaluation value totaling module (2) for the data in the 
external evaluation value table that is the result judged by humans. Next, as step 12, relational 
judgment module (3) judges from the ratio of these two numbers whether the data should or should 
not be retrieved. The value of the judgment result is sent to external evaluation value and 
retrieval technique agreement determination module (1) along with the number of people that think 
it should be retrieved and the number of people who think it should not be retrieved, where they 
are stored in an external evaluation value buffer (shown in Figure 6) provided for external 
evaluation value and retrieval technique agreement determination module (1). 

[0033] 

A method for judging whether or not to retrieve at step 12 described above can be realized 
by comparing the number of people that think something should be retrieved and the number of 
people who think it should not be retrieved and letting the majority side [decide]. For example, in 
the case of AAa009 in Figure 4, the number of people who think it should be retrieved is 5, and the 
number of people who think it should not be retrieved is 2, so it is determined to be data that should 
be retrieved. 

[0034] 

As another method for deciding data to be retrieved at step 12, when the judgment from 
humans is split, such as when the number of people that think AAa002, for example, should be 
retrieved is 4, and the number of people who think it should not be retrieved is 3, there is the 
possibility that a large amount of meaningless information will be provided. So assuming that such 
data should not be retrieved, and for example, setting the threshold at 70% beforehand, when the 
following Formula (3) is satisfied, 



Formula 2 

\[
\frac{N_{\mathrm{yes}}}{N_{\mathrm{yes}} + N_{\mathrm{no}}} > 0.7 \qquad (3)
\]

Key: N_yes: Number of people who think it should be retrieved 
N_no: Number of people who think it should not be retrieved 



[the method] can be realized by determining that it is data that should be retrieved. 
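
The two ways of deciding at step 12 (simple majority, and a 70% threshold on the share of people in favor) can be sketched as below. The function names are hypothetical, and the treatment of an empty panel is an assumption that the text does not address.

```python
def majority_rule(yes: int, no: int) -> bool:
    """Retrieve when more people think it should be retrieved than not."""
    return yes > no


def threshold_rule(yes: int, no: int, threshold: float = 0.7) -> bool:
    """Retrieve only when the share of people in favor exceeds the threshold."""
    total = yes + no
    if total == 0:
        # Assumption: with no judgments there is nothing to decide on.
        raise ValueError("at least one human judgment is required")
    return yes / total > threshold
```

With the counts quoted in the text, 5 against 2 passes both rules (5/7 is about 0.71), while 4 against 3 passes the majority rule but fails the 70% threshold (4/7 is about 0.57).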



[0035] 



Next, data in an information retrieval technique retrieval result table given for individual 
retrieval conditions and individual data in a database are sent to external evaluation value and 
retrieval technique agreement determination module (1) as step 13 and stored in sequence in an 
evaluation value storage buffer in the module. The data input from the information retrieval 
technique retrieval result table and the data sent from relational judgment module (3) are stored 
together as shown in Figure 6 in the evaluation value storage buffer. When two or more retrieval 
techniques are evaluated, the input data from the information retrieval technique retrieval result 
table are replaced and processed in sequence. Next, as step 14, the results of the retrieval technique 
for each retrieval condition/data identifier and the results from relational judgment module (3) are 
compared in external evaluation value and retrieval technique agreement determination module 
(1). When the two values are equal, a signal is sent to constant value generation module (4) as step 
15, and a value of 0, for example, is generated and sent to average penalty calculation module (6). 
When the result of comparison at step 14 is that the retrieval technique and the value of the 
determination from the relational judgment module are different, the number of people that think it 
should be retrieved and the number of people who think it should not be retrieved in the evaluation 
value storage buffer are sent to penalty calculation module (5). As step 16, the penalty when the 
information retrieval technique and the value of relational judgment module (3) are different is 
calculated by a predetermined formula in penalty calculation module (5). 



One actual method of penalty calculation at step 16 can be realized by generating a 
predetermined constant value. For example, in the case of AAa001 where the value of the 
information retrieval technique retrieval result and the value from the relational judgment module 
are different, as shown in Figure 6, 1 is generated as the constant value, for example, by step 16 
and this value is sent to average penalty calculation module (6). 



[0036] 
[0037] 

The average value of the values that indicate the degree of matching between the retrieval 
technique and human judgment obtained by constant value generation module (4) and penalty 
calculation module (5) is calculated as step 17, so that the degree of matching between human 
judgment and the retrieval technique as a whole is expressed as a single numerical evaluation 
value. Next, as step 18, the evaluation value is stored in evaluation value buffer (7) along with a 
number assigned to the retrieval technique. 

[0038] 

The processing from step 13 to step 18 is performed for all given retrieval techniques. 
When it is determined at step 18-1 that processing for all given retrieval techniques is completed, 
the retrieval technique numbers and evaluation values stored in evaluation value buffer (7) 
are rearranged in order of smaller evaluation value in sorting module (8) as step 19. 
Rearrangement at step 19 can easily be realized with existing technology, such as insertion sort, 
heapsort, quicksort, or the like. By outputting the retrieval technique numbers arranged in order 
of smaller evaluation value as step 20, the retrieval techniques can be selected in order starting 
from those closer to human judgment. 
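
Steps 11 through 20 can be summarized in a short sketch. It assumes the majority rule at step 12, a constant of 0 on agreement, and a constant penalty of 1 on disagreement (the simplest variant described above); the function name `rank_techniques`, the dictionary layout, and the sample identifiers and vote counts in the test data are all hypothetical and are not taken from the figures.

```python
from statistics import mean


def rank_techniques(techniques, human_counts, penalty=lambda yes, no: 1.0):
    """Rank retrieval techniques by average penalty, smallest first.

    techniques:   {technique_no: {pair_id: bool}}  retrieval result table
    human_counts: {pair_id: (yes, no)}             totals from steps 11
    penalty:      function of (yes, no) used on disagreement (step 16)
    """
    buffer = {}  # plays the role of the evaluation value buffer
    for technique_no, results in techniques.items():
        values = []
        for pair_id, (yes, no) in human_counts.items():
            human_says_retrieve = yes > no              # step 12, majority rule
            if results[pair_id] == human_says_retrieve:  # step 14
                values.append(0.0)                       # step 15, constant value
            else:
                values.append(penalty(yes, no))          # step 16, penalty value
        buffer[technique_no] = mean(values)              # steps 17 and 18
    # Steps 19 and 20: output in order of smaller evaluation value.
    return sorted(buffer.items(), key=lambda item: item[1])
```

Passing a different `penalty` function swaps in one of the graded penalty variants without changing the rest of the flow.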

[0039] 

As another method for realizing penalty calculation at step 16, consider AAa100 and AAb001 
in Figure 6. In both cases the result from the retrieval technique gives a different judgment. 
For AAa100, however, nearly all the people think that it should not be retrieved, whereas for 
AAb001 the number of people that think it should be retrieved and the number of people that think 
it should not be retrieved are extremely close, so it cannot be said unconditionally that the 
judgment by the retrieval technique differs from that of the humans as to whether to retrieve or 
not. So by changing the penalty value according to the number of people, giving a larger penalty 
when there is great variation from the human opinion, and giving a smaller penalty when the 
human judgment is ambiguous, classification can be accomplished more precisely in order starting 
with the retrieval techniques closer to human judgment. To realize this, for example, penalty 
calculation module (5) would be configured as shown in Figure 7. 

[0040] 

Penalty calculation module (5) shown in Figure 7 is composed of working buffer (21) that 
temporarily stores the number of people that think something should be retrieved and the number 
of people that think it should not be retrieved that are input, comparison module (22) that compares 
the two data in working buffer (21), division module (23) that divides the compared result using 
the smaller value as the numerator and the larger value as the denominator, and 
penalty value generation module (24) that uses the quotient subtracted from 1 as 
the penalty value. 

[0041] 

The processing in penalty calculation module (5) is accomplished as follows. As shown in 
Figure 8, the number of people that think it should be retrieved and the number of people that think 
it should not be retrieved that are input are stored in working buffer (21) as step 25. Next, as step 
26, the values of the two data in working buffer (21) are compared in comparison module (22) and 
the smaller value is stored in numerator buffer (23b) of division module 
(23) and the larger value in denominator buffer (23a). As step 27, the value in numerator buffer 
(23b) is divided by the value in denominator buffer (23a). Finally, as step 28, the quotient 
subtracted from 1 is used as the penalty value in penalty value generation module (24). 

[0042] 

With this method of realization, when the judgment by the retrieval technique is different, 
despite the fact that the human judgments nearly all agree, as for AAa100 in Figure 6, 1 - 1/6 = 
0.83 is given. When the human opinions are split, as for AAb001, 1 - 3/4 = 0.25 is given, which is 
a smaller penalty value. The result will be that when the human judgment is split, the penalty value 
for techniques that differ from many of the human judgments will be small, and the average penalty 
value for techniques that show different judgments despite the fact that the human judgments agree 
will be larger. So both can be clearly separated. 
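
The ratio-based penalty of Figures 7 and 8 amounts to one minus the smaller count divided by the larger count. A minimal sketch follows; the function name `ratio_penalty` is hypothetical, and rejecting an empty panel is an assumption not stated in the text.

```python
def ratio_penalty(yes: int, no: int) -> float:
    """Penalty = 1 - (smaller count / larger count)."""
    smaller, larger = sorted((yes, no))   # comparison module
    if larger == 0:
        # Assumption: a penalty is undefined when nobody has judged the datum.
        raise ValueError("at least one human judgment is required")
    quotient = smaller / larger           # division module
    return 1.0 - quotient                 # penalty value generation module
```

With the counts used in the text, 1 against 6 gives about 0.83 and 3 against 4 gives 0.25.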

[0043] 

The idea of the penalty in this invention is based on the fact that if the result produced by a 
retrieval technique is to give a different judgment, despite the fact that nearly all the humans think 
that it should not be retrieved, as for AAa100 in Figure 6, the degree of importance of the 
information that the retrieval technique does not agree with human judgment is high. If the number 
of people that think it should be retrieved and the number of people that think it should not be 
retrieved are very close, as for AAb001, it cannot be determined unconditionally whether or 
not it should be retrieved, so the importance of the information that there is not agreement with 
human judgment can be considered small. 



[0044] 
This idea can also be realized with a technique where processing at step 16 applies an 
entropy formula, since there is a portion similar to the concept of information entropy. One 
application example where an entropy formula is applied can be realized by the configuration as 
shown in Figure 9. One example of penalty calculation module (5) shown in Figure 9 is composed 
of working buffer (31) that temporarily stores the number of people that think it should be 
retrieved and the number of people that think it should not be retrieved that are input, comparison 
module (22) that compares the two data in working buffer (31), number of people responding 
calculation module (32) that calculates the total number of people that made a judgment from the 
values of the two data in working buffer (31), entropy calculation module (33) that calculates the 
entropy value from the number of people that think it should be retrieved, the number of people 
that think it should not be retrieved, and the number of people responding obtained from number 
of people responding calculation module (32), and penalty value generation module (34) that uses 
the entropy value subtracted from 1 as the penalty value. 

[0045] 

As processing in the penalty calculation module (5) that is configured as described above, 
as shown in Figure 10, as step 35, the number of people that think something should be retrieved 
and the number of people that think it should not be retrieved that are input are stored in working 
buffer (31). Next, as step 36, the values for the two data in working buffer (31) are added together 
to calculate the number of people in number of people responding calculation module (32). Next as 
step 37, entropy calculation is performed with Formula (4) shown below in entropy calculation 
module (33). 

[0046] 



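Formula 3 

\[
E = -\frac{n}{N}\log_2\frac{n}{N} - \left(1 - \frac{n}{N}\right)\log_2\left(1 - \frac{n}{N}\right) \qquad (4)
\]

Key: n: Number of people that think something should be retrieved 
N: Number of people responding 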

(1-E) is calculated for entropy (E) found here in penalty value generation module (34) as step 38 to 
accomplish penalty calculation. With this method of realization, when the human judgments are 
split, as for AAb001 in Figure 6, a small penalty value of 1 - 0.99 = 0.01 will be given. The 
result is that the penalty value for a technique that differs from many human judgments when the 
human judgments are split will be smaller. With a technique that shows a different judgment, 
despite the fact that the human judgments nearly all agree, as for AAa100, the average penalty 
value will be larger, so both can be clearly separated. 
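
The entropy-based penalty of steps 35 through 38 can be sketched as follows. The sketch assumes the entropy in question is the two-outcome (binary) information entropy with base-2 logarithms, computed from the share of people who think the datum should be retrieved; that reading reproduces the value 0.99 quoted in the text for a 3-to-4 split. The function names are hypothetical.

```python
import math


def binary_entropy(yes: int, total: int) -> float:
    """Entropy (base 2) of the share of people in favor of retrieval."""
    p = yes / total
    if p == 0.0 or p == 1.0:
        return 0.0  # unanimous judgment carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def entropy_penalty(yes: int, no: int) -> float:
    """Penalty = 1 - E, where E is the entropy of the human judgments."""
    total = yes + no                    # step 36: number of people responding
    entropy = binary_entropy(yes, total)  # step 37: entropy calculation
    return 1.0 - entropy                # step 38: penalty value
```

Under this assumption, a 3-to-4 split gives a penalty of about 0.01, and a 1-to-6 split gives a larger penalty of about 0.41.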

[0047] 

Next, a second application example of this invention will be explained. Figure 11 is a block 
diagram of an information retrieval technique evaluation device in a second application example. 
Said device is composed of a computer, which is primarily a CPU, and programs. 

[0048] 

With this second application example, as shown in Figure 11, nearly the same 
configuration as the first application example shown in Figure 1 is used. In this application 
example as well, the information retrieval technique retrieval result table, which is the retrieval 
result for a database produced by each type of retrieval technique, and an external evaluation 
value table, which is the result of judgment by humans as to whether or not the same data should 
be retrieved, are externally input. 

[0049] 

To measure the capability of a retrieval technique from the input data, the information
retrieval technique evaluation device in the second application example is composed of 10
modules: external evaluation value and retrieval technique agreement determination module (1),
which compares the judgment as to whether or not something should be retrieved produced by the
retrieval technique with the judgments given by humans; external evaluation value totaling
module (2), into which the results of human judgment made by one or more persons are input and
which totals the number of people that think something should be retrieved and the number of
people that think it should not be retrieved; relational judgment module (3), which determines
whether or not something should be retrieved from the ratio of the number of people that think it
should be retrieved and the number of people that think it should not be retrieved; constant value
generation module (4), which generates a predetermined constant value when external evaluation
value and retrieval technique agreement determination module (1) determines that the judgment
produced by the retrieval technique and the human judgment agree; penalty calculation module (5),
which, when the judgment produced by the retrieval technique and the human judgment do not agree,
determines a penalty value for the incorrect judgment of the retrieval technique from the ratio of
people that think it should or should not be retrieved; average penalty calculation module (6),
which finds the average of the penalty values calculated for all combinations of retrieval
conditions and data to be retrieved for one retrieval technique; actual evaluation value
calculation module (41), which calculates the amount of information for human judgments from the
ratio of people that think something should or should not be retrieved; imaginary evaluation value
calculation module (42), which calculates the amount of information for human judgments for the
case where the human judgment changes slightly; intermediate buffer (43), which stores the
calculated average penalty value, the actual evaluation value, and the imaginary evaluation value
as a set with the retrieval technique; and technique order calculation module (44), which
sequences the retrieval techniques in intermediate buffer (43) from the evaluation value, actual
evaluation value, and imaginary evaluation value. As in the first application example, each module
is constituted as a program (software) processing unit, but the modules can also be constituted
with hardware.

[0050] 

The information retrieval technique retrieval result table and the external evaluation value
table, which are the input data, are each the same as in the first application example described
above, so their explanation will be omitted. However, whereas the first application example used
only one external evaluation value table for all the retrieval techniques being evaluated, in this
second application example an external evaluation value table with a different number of people
performing the evaluation can also be used for each retrieval technique.

[0051] 

As the processing in the second application example, as shown in Figure 12, first, as step 51,
for the data in the external evaluation value table, which is the result of judgment by humans, the
number of people that think it should be retrieved and the number of people that think it should
not be retrieved are each counted in external evaluation value totaling module (2), and the counts
are sent to relational judgment module (3), to actual evaluation value calculation module (41),
and to imaginary evaluation value calculation module (42).

[0052] 

Next, as step 52, it is determined in relational judgment module (3) whether the data should be
retrieved or should not be retrieved, from the ratio of the number of people that think it should
be retrieved and the number of people that think it should not be retrieved. The value for the
result of the judgment is sent to external evaluation value and retrieval technique agreement
determination module (1), along with the number of people that think it should be retrieved and
the number of people that think it should not be retrieved, where it is stored in the external
evaluation value buffer shown in Figure 6.

[0053]

The method for judging whether or not something should be retrieved at step 52 is the same
method as used in the first application example, so an explanation will be omitted.

[0054] 

Next, as step 53, the data in the information retrieval technique retrieval result table, given
for each retrieval condition and each datum in the database, are sent to external evaluation value
and retrieval technique agreement determination module (1), where they are stored in order in an
evaluation value storage buffer in the module. The evaluation value storage buffer stores data
input from the information retrieval technique retrieval result table and data sent from relational
judgment module (3) in pairs as shown in Figure 4. When two or more retrieval techniques are
evaluated, the data input from the retrieval technique retrieval result table and the external
evaluation value table are replaced and processed in sequence. Next, as step 54, the result of the
retrieval technique for each retrieval condition and data identifier and the result from relational
judgment module (3) are compared in external evaluation value and retrieval technique agreement
determination module (1). When the two values are equal, as step 55, a signal is sent to constant
value generation module (4), a value of 0, for example, is generated, and it is sent to average
penalty calculation module (6).

[0055] 

When the result of the comparison at step 54 is that the value determined by the retrieval
technique and the value determined by relational judgment module (3) are different, the number of
people that think it should be retrieved and the number of people that think it should not be
retrieved, held in the evaluation value storage buffer, are sent to penalty calculation module (5).
As step 56, the penalty for the case where the values from the retrieval technique and from
relational judgment module (3) differ is calculated with a predetermined formula in penalty
calculation module (5).

[0056] 

One method of realizing the penalty calculation at step 56 is, for example, the technique using
entropy described in the first application example, so an explanation will be omitted here.

[0057] 

For the values that represent the degree of agreement between the retrieval technique and human
judgments, obtained by constant value generation module (4) and penalty calculation module (5),
the average value is calculated as step 57. This generates an evaluation value that expresses the
degree of agreement between the human and retrieval technique judgments for the retrieval
technique as a whole.
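Steps 54 to 57 can be sketched as follows. This is only an illustration under stated assumptions: the relational judgment is taken to be a simple majority vote (the specification defers to the first application example for the actual rule), the penalty is taken to be 1 minus the base-2 entropy of the votes, and all names and sample data are hypothetical.

```python
import math


def entropy_penalty(n_yes: int, n_no: int) -> float:
    """Penalty 1 - E, with E assumed to be the base-2 entropy of the votes."""
    total = n_yes + n_no
    entropy = 0.0
    for count in (n_yes, n_no):
        if count > 0:  # 0 * log(0) is treated as 0
            p = count / total
            entropy -= p * math.log2(p)
    return 1.0 - entropy


def evaluation_value(technique_flags, human_counts, agree_constant=0.0):
    """Average score over all (retrieval condition, datum) pairs for one technique.

    technique_flags: {identifier: 0 or 1} from the retrieval result table.
    human_counts:    {identifier: (n_yes, n_no)} from the totaling module.
    """
    scores = []
    for key, technique_flag in technique_flags.items():
        n_yes, n_no = human_counts[key]
        human_flag = 1 if n_yes > n_no else 0  # assumed majority rule
        if technique_flag == human_flag:
            scores.append(agree_constant)  # step 55: constant value, e.g. 0
        else:
            scores.append(entropy_penalty(n_yes, n_no))  # step 56: penalty
    if not scores:
        raise ValueError("no judgments to average")
    return sum(scores) / len(scores)  # step 57: average value


if __name__ == "__main__":
    flags = {"q1-d1": 1, "q1-d2": 0, "q1-d3": 1}
    counts = {"q1-d1": (7, 0), "q1-d2": (6, 1), "q1-d3": (3, 4)}
    print(round(evaluation_value(flags, counts), 3))
```

A smaller evaluation value means closer agreement with the human judgments, which is why the techniques are later output in ascending order of this value.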

[0058] 

In actual evaluation value calculation module (41), as step 58, the amount of information is
calculated from the number of humans that think it should be retrieved and the number of humans
that think it should not be retrieved. The amount of information can be calculated, for example,
with the same technique as the calculation of entropy in the first application example, so an
explanation will be omitted, since it can easily be inferred.

[0059] 

In imaginary evaluation value calculation module (42), as step 59, whichever is the majority
side, the number of people that think something should be retrieved or the number of people that
think it should not be retrieved, is reduced by a predetermined number of people, for example,
1 person, and the number of people removed is added to the minority side. In this way, noise is
added to the result of human judgment.

[0060] 

Next, as step 60, the amount of information for a case where noise is provided is calculated 
to give an imaginary evaluation value. Calculation of the imaginary evaluation value is the same as 
the calculation of entropy in the first application example, so an explanation will be omitted here. 
Processing in actual evaluation value calculation module (41) and imaginary evaluation value
calculation module (42) is performed for all cases, as is apparent from Figure 11.
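Steps 58 to 60 can be illustrated with the sketch below. It assumes that the amount of information is the base-2 entropy of the two vote counts and that it is averaged over all relations, as the key to the second application example's flowchart states for steps 58 and 60. The function names, the sample counts, and the treatment of an exact tie (the "should be retrieved" side is treated as the majority) are assumptions made only for illustration.

```python
import math


def binary_entropy(n_yes: int, n_no: int) -> float:
    """Base-2 entropy of the votes (assumed form of the amount of information)."""
    total = n_yes + n_no
    entropy = 0.0
    for count in (n_yes, n_no):
        if count > 0:  # 0 * log(0) is treated as 0
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def add_noise(n_yes: int, n_no: int, shift: int = 1):
    """Step 59: move `shift` people from the majority side to the minority side."""
    if n_yes >= n_no:  # assumption: on a tie, "yes" is treated as the majority
        shift = min(shift, n_yes)
        return n_yes - shift, n_no + shift
    shift = min(shift, n_no)
    return n_yes + shift, n_no - shift


def actual_evaluation_value(counts):
    """Step 58: average amount of information over all relations."""
    return sum(binary_entropy(y, n) for y, n in counts) / len(counts)


def imaginary_evaluation_value(counts, shift: int = 1):
    """Step 60: the same average after noise has been added to every relation."""
    noisy = [add_noise(y, n, shift) for y, n in counts]
    return sum(binary_entropy(y, n) for y, n in noisy) / len(noisy)


if __name__ == "__main__":
    votes = [(7, 0), (6, 1), (3, 4)]  # hypothetical (yes, no) counts
    print(round(actual_evaluation_value(votes), 3))
    print(round(imaginary_evaluation_value(votes), 3))
```

The gap between the two values indicates how sensitive the amount of information is to one person changing their judgment, which is larger when only a few people responded.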

[0061] 

Next, as step 61, the evaluation value, actual evaluation value, and imaginary evaluation
value are stored in intermediate buffer (43) along with a number assigned to the retrieval
technique.

[0062] 

The processing from step 51 to step 61 is performed for all the given retrieval
techniques. When it is determined at step 61-1 that processing for all the given retrieval
techniques is completed, the retrieval technique numbers and evaluation values are stored in
intermediate buffer (43). As step 62, the order of the retrieval techniques by similarity to human
judgment is determined in technique order calculation module (44) from the evaluation value, the
actual evaluation value, and the imaginary evaluation value. In addition, as step 63, the retrieval
technique numbers are output in ascending order of evaluation value.



[0063]

Technique order calculation module (44) is realized with a configuration as shown in
Figure 13, for example. Technique order calculation module (44) is constituted from 4 modules:
confidence interval calculation module (71), which calculates a confidence interval from the three
values of evaluation value, actual evaluation value, and imaginary evaluation value sent from
intermediate buffer (43); sequencing module (72), which sequences the retrieval techniques relative
to the evaluation value and the calculated lower and upper limits of each; synthesized order value
calculation module (73), which generates one order value according to a predetermined formula from
the respective orders of the lower limit, upper limit, and evaluation value assigned to individual
retrieval techniques by sequencing module (72); and sorting module (74), which arranges and outputs
the order values calculated for each retrieval technique in order from the smallest.

[0064]

As processing in technique order calculation module (44), as shown in Figure 14, as step
81, the lower limit value and upper limit value are calculated using, for example, Formulas (5)
and (6) below in confidence interval calculation module (71), into which the evaluation value, the
actual evaluation value, and the imaginary evaluation value are input from intermediate
buffer (43).

[0065]

Formulas 4

Key: 1 Lower limit value
     2 Evaluation value
     3 Actual evaluation value
     4 Imaginary evaluation value
     5 Upper limit value



The calculated upper limit value, lower limit value, and evaluation value are stored in a buffer
in sequencing module (72) as step 82. Steps 81 and 82 are repeated until it is determined at
step 80 that calculation of the upper limit value and lower limit value has been completed for all
the retrieval techniques. After calculation is completed, as step 83, a number is assigned to
each technique in order from the smallest for each of the lower limit value, upper limit value, and
evaluation value in alignment and order determination part (72a) in sequencing module (72). This
processing can easily be realized with a general sorting algorithm, so a detailed explanation will
be omitted.

[0066] 

Next, as step 84, a synthesized order value is calculated from the lower limit order,
evaluation order, and upper limit order in synthesized order value calculation module (73). The
synthesized order value can be found, for example, by totaling the three orders, as shown in the
figure of the buffer in synthesized order value calculation module (73) in Figure 13.

[0067] 

Of the lower limit order and the upper limit order, the lower limit order represents retrieval
efficiency when viewed optimistically, and the upper limit order represents retrieval efficiency
when viewed pessimistically. As another calculation method for the synthesized order value, a
pessimistic viewpoint can be taken toward the retrieval techniques so that techniques with a larger
upper limit value are discarded: for example, when the total of the lower limit order, the upper
limit order, and the evaluation order is found, the value of the upper limit order is added two
more times. Then, in the case of Figure 13, technique A will be 13, technique B 8,
technique C 9, and technique D 20, and the order value of technique B will be the smallest.
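The two ways of combining the orders can be sketched as follows. The lower limit, evaluation, and upper limit values below are hypothetical; they are not the data of Figure 13 and were chosen only so that the pessimistic totals coincide with the 13, 8, 9, and 20 quoted above. The function names and the `upper_weight` parameter are likewise assumptions, with a weight of 3 standing for "the upper limit order is added two more times".

```python
def rank_ascending(values):
    """Assign order 1 to the smallest value, 2 to the next, and so on."""
    ordered = sorted(values, key=values.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


def synthesized_order(lower, evaluation, upper, upper_weight=1):
    """Total of the three orders; upper_weight=3 gives the pessimistic variant."""
    lower_rank = rank_ascending(lower)
    evaluation_rank = rank_ascending(evaluation)
    upper_rank = rank_ascending(upper)
    return {
        name: lower_rank[name] + evaluation_rank[name] + upper_weight * upper_rank[name]
        for name in evaluation
    }


if __name__ == "__main__":
    # Hypothetical values for four techniques (not taken from Figure 13).
    lower = {"A": 0.01, "B": 0.05, "C": 0.03, "D": 0.20}
    evaluation = {"A": 0.12, "B": 0.10, "C": 0.08, "D": 0.30}
    upper = {"A": 0.25, "B": 0.15, "C": 0.18, "D": 0.40}

    plain = synthesized_order(lower, evaluation, upper)
    pessimistic = synthesized_order(lower, evaluation, upper, upper_weight=3)
    print(plain)
    print(pessimistic)
    print(sorted(pessimistic, key=pessimistic.get))  # techniques, best first
```

With these sample values the plain total ranks technique C first, while the pessimistic total ranks technique B first, which shows how weighting the upper limit order can change the outcome.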

[0068] 

Next, as step 85, the technique numbers are output in ascending order of synthesized order
value in sorting module (74). Sorting of the synthesized order values in sorting module (74) can
easily be accomplished with a general sorting algorithm, so a detailed explanation will be
omitted.

[0069] 

Next, a third application example of this invention will be explained. The configuration of the
third application example can be added to either the first or the second application example
described above, so here it will be explained using the first application example.

[0070]

A block diagram of the third application example is shown in Figure 15. In the configuration
shown in Figure 15, same number check module (9) and same number occurrence constant generation
module (10) are added to the configuration of the first application example.

[0071] 

For processing in the third application example, as shown in Figure 16, step 91 is inserted
between step 11 and step 12 in Figure 5. As step 91, the number of people that think it should be
retrieved and the number of people that think it should not be retrieved are compared in same
number check module (9) to determine whether or not they are the same number. When the two numbers
are the same, as step 92, a value of -1, for example, is sent to external evaluation value and
retrieval technique agreement determination module (1), along with the number of people that
think it should be retrieved and the number of people that think it should not be retrieved, in
place of the result from relational judgment module (3), to indicate that determination is
impossible.

[0072] 

When the number of people that think something should be retrieved and the number of people
that think it should not be retrieved are not the same, step 12 is executed. Processing in steps 12
and 13 is the same as in the first application example, so an explanation will be omitted.

[0073] 

Next, in place of step 14 shown in Figure 5, as step 93, the result of the retrieval technique
for each retrieval condition and data identifier and the result of relational judgment module (3)
are compared in external evaluation value and retrieval technique agreement determination
module (1). When the two values are equal, or when determination is impossible (for example, when
-1 has been given to the determination flag produced by relational judgment module (3) in the
buffer), as step 15, a signal is sent to same number occurrence constant generation module (10), a
value of 0, for example, is generated, and it is sent to average penalty calculation module (6).
Processing after step 15 is the same as in the first application example, so an explanation will
be omitted.
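The handling of an exact tie in steps 91 to 93 can be sketched as below. The sketch assumes a simple majority rule for the relational judgment and an entropy-based penalty of 1 minus the base-2 entropy; both are assumptions carried over for illustration, and the names `UNDETERMINED`, `relational_judgment`, and `relation_score` are hypothetical.

```python
import math

UNDETERMINED = -1  # example flag value given in the text for a tie


def entropy_penalty(n_yes: int, n_no: int) -> float:
    """Penalty 1 - E, with E assumed to be the base-2 entropy of the votes."""
    total = n_yes + n_no
    entropy = 0.0
    for count in (n_yes, n_no):
        if count > 0:  # 0 * log(0) is treated as 0
            p = count / total
            entropy -= p * math.log2(p)
    return 1.0 - entropy


def relational_judgment(n_yes: int, n_no: int) -> int:
    """Steps 91 and 92: return -1 on a tie, otherwise the majority flag."""
    if n_yes == n_no:
        return UNDETERMINED
    return 1 if n_yes > n_no else 0  # assumed majority rule


def relation_score(technique_flag: int, n_yes: int, n_no: int, constant: float = 0.0) -> float:
    """Step 93: constant value on agreement or on a tie, penalty otherwise."""
    human_flag = relational_judgment(n_yes, n_no)
    if human_flag == UNDETERMINED or human_flag == technique_flag:
        return constant
    return entropy_penalty(n_yes, n_no)


if __name__ == "__main__":
    print(relation_score(1, 3, 3))  # tie among six people: constant value
    print(relation_score(0, 6, 0))  # disagreement with unanimous humans
```

Because a tie always yields the constant value, a retrieval technique is neither rewarded nor penalized for data on which the human judges could not reach a majority.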

[0074] 

When the same function is added to the technique in the second application example, it can
be achieved in the same way by adding same number check module (9) between external evaluation
value totaling module (2) and relational judgment module (3) in Figure 11. Processing when it is
added is the same as when same number check module (9) is added to the configuration of the first
application example, and an explanation will be omitted, since it can easily be inferred.

[0075] 

An example of data in the buffer in external evaluation value and retrieval technique 
agreement determination module (1) in the third application example is as shown in Figure 17. 

[0076] 

Next, the results obtained when the capabilities of this invention were tested are shown
below.

Experiment conditions: 

Number of data to be retrieved used for the experiment: 12 (retrieval objects are composed 
of a set of keywords accompanying newspaper articles selected at random, and the nouns, proper 
nouns, and verbal nouns comprising the newspaper article text) 

Number of samples of retrieved data actually used: 50 for each condition (randomly 
extracted) 

Retrieval techniques compared: 4 

1. AND retrieval: retrieved data acquired with all keywords in the retrieval object 

[0077] 

2. OR retrieval: retrieved data acquired with any one or more of the keywords in the 
retrieval object. 

[0078] 

3. Best retrieval: Retrieval performed artificially so that both the relevance factor and recall 
ratio will be high for evaluation by combining any one or more keywords in the retrieval object. 

4. Single word retrieval: data with a high degree of matching to single words in the set of 
nouns, proper nouns and verbal nouns comprising the text of the retrieval object is acquired. 

[0079] 

Number of test subjects: 9 

Total number of evaluation values: 12 x 50 = 600.

Calculation method in the evaluation: The technique applying the concept of entropy, as used in
the first through third application examples, was used. In this experimental example, the number
of test subjects is an odd number and the human judgments are the same for every technique, so the
second and third application examples give the same result as the first application example. This
is because the width between the upper limit value and the lower limit value in the second
application example is the same for any technique, and because the number of people that think
something should be retrieved and the number of people that think it should not be retrieved can
never be the same in the third application example.

[0080] 

Results: the results when the experiment was performed are shown below. The relationship 
between recall ratio and relevance factor used in the past and the evaluation value used in this 
invention is reflected in the following table. 



[0081] 
Table 1

Retrieval technique      Recall ratio   Relevance factor   This invented technique
AND retrieval            1.8%           50.0%              0.145
OR retrieval             100.0%         36.9%              0.3I&
Best retrieval           49.0%          96.5%              0.045
Single word retrieval    55.3%          73.0%              0.076
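For comparison, the conventional measures in Table 1 can be computed as in the sketch below. It assumes that the recall ratio is the share of data that should be retrieved which the technique actually retrieved, and that the relevance factor corresponds to what is usually called precision, that is, the share of retrieved data that should have been retrieved. The function name and sample flags are hypothetical.

```python
def recall_and_relevance(technique_flags, human_flags):
    """Return (recall ratio, relevance factor) for one retrieval technique.

    technique_flags: {identifier: 0 or 1} judgment flags of the technique.
    human_flags:     {identifier: 0 or 1} flags derived from human judgment.
    """
    retrieved = {key for key, flag in technique_flags.items() if flag == 1}
    relevant = {key for key, flag in human_flags.items() if flag == 1}
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    relevance = hits / len(retrieved) if retrieved else 0.0
    return recall, relevance


if __name__ == "__main__":
    technique = {"d1": 1, "d2": 1, "d3": 0, "d4": 1}
    human = {"d1": 1, "d2": 0, "d3": 1, "d4": 1}
    print(recall_and_relevance(technique, human))
```

Neither measure uses how strongly the human judges agreed on each datum, which is the information the entropy-based evaluation value adds.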



As is clear from the table, using just the recall ratio and relevance factor, it cannot be
determined whether OR retrieval, best retrieval, or single word retrieval is better. However, in
the case of OR retrieval, data that most people judged should not be retrieved were included. With
this invented technique, there is clearly a difference between OR retrieval, best retrieval, and
single word retrieval. In addition, it also shows that best retrieval, in which retrieval was
performed artificially so that a high retrieval efficiency would occur, is the optimum method. As
a result, with this invented technique, it was possible to rank the retrieval techniques, starting
with the best, as best retrieval, single word retrieval, AND retrieval, and OR retrieval.

[0082]

Effect of the invention 

As explained above, with the information retrieval technique evaluation method mentioned
in Claim 1 of this invention, for database retrieval techniques that acquire data matching the
retrieval conditions from a database when the retrieval conditions are input, the merits and
disadvantages of the retrieval efficiency of one or more retrieval techniques can be determined
automatically. This is done by inputting, for each datum in the database or in sample data
extracted from the database, judgment data provided by one or more humans as to whether or not the
retrieval conditions are matched, together with the retrieval results obtained by each retrieval
technique for those retrieval conditions.

[0083] 

Also, with Claim 2, in addition to the effect described above, account is taken of the fact
that the statistical reliability of the human judgment results differs between judgments made by a
large number of people and judgments made by a small number of people, e.g., 1 or 2. The merits of
the retrieval techniques can therefore be determined even when the test subjects in the
human-based evaluation differ from one retrieval technique to another.

[0084] 

Also, with Claim 3, in addition to the effects described above, even if the number of people 
that make a judgment is an even number and the number of people that think it should be retrieved 
and the number of people that think it should not be retrieved is the same, by comparing the
number of people that think it should be retrieved and the number of people that think it should not 
be retrieved, between one datum to be retrieved and one [other] datum to be retrieved, and 
generating a predetermined value when they are the same number, the merits of the retrieval 
system can be determined when an evaluation is performed with any number of people. 

[0085] 

Also, with the information retrieval technique evaluation device mentioned in Claim 4, for
database retrieval techniques that acquire data matching the retrieval conditions from a database
when the retrieval conditions are input, the merits and disadvantages of the retrieval efficiency
of one or more retrieval techniques can be determined automatically. This is done by inputting,
for each datum in the database or in sample data extracted from the database, judgment data
provided by one or more humans as to whether or not the retrieval conditions are matched, together
with the retrieval results obtained by each retrieval technique for those retrieval conditions.

[0086]

Also, with Claim 5, in addition to the effect described above, account is taken of the fact
that the statistical reliability of the human judgment results differs between judgments made by a
large number of people and judgments made by a small number of people, e.g., 1 or 2. The merits of
the retrieval techniques can therefore be determined even when the test subjects in the
human-based evaluation differ from one retrieval technique to another.

[0087] 

Also, with Claim 6, in addition to the effects described above, even if the number of people 
that make a judgment is an even number and the number of people that think it should be retrieved 
and the number of people that think it should not be retrieved is the same, by comparing the
number of people that think it should be retrieved and the number of people that think it should not
be retrieved, between one datum to be retrieved and one [other] datum to be retrieved, and
generating a predetermined value with a constant generation means when they are the same
number, the merits of the retrieval system can be determined when an evaluation is performed with
any number of people. 

Brief description of the figures 

Figure 1 is a system block diagram of an information retrieval technique evaluation device 
in a first application example of this invention. 

Figure 2 is a conceptual diagram that shows an evaluation method for retrieval efficiency. 

Figure 3 is a figure that shows the data structure and data examples in the information 
retrieval technique retrieval result table in the first application example. 

Figure 4 is a figure that shows the data structure and data examples in the external
evaluation value table in the first application example.

Figure 5 is a flowchart that shows processing in the information retrieval technique
evaluation method in the first application example.

Figure 6 is a figure that shows the data structure in the external evaluation value buffer in
the first application example.

Figure 7 is a system block diagram of the penalty calculation module in the first application 
example. 

Figure 8 is a flowchart that shows the flow of processing by the penalty calculation module 
in the first application example. 

Figure 9 is a system block diagram that shows another example of the penalty calculation
module.

Figure 10 is a flowchart that shows the processing flow in the other example of the penalty
calculation module.

Figure 11 is a system block diagram of an information retrieval technique evaluation
device in a second application example.

Figure 12 is a flowchart that shows the processing flow by the information retrieval 
technique evaluation method in the second application example. 

Figure 13 is a system block diagram of the technique order calculation module in the 
second application example. 

Figure 14 is a flowchart that shows the processing in the technique order calculation 
module in the second application example.

Figure 15 is a system block diagram of an information retrieval technique evaluation 
device in the third application example. 

Figure 16 is a flowchart that shows the processing in the information retrieval technique 
evaluation method in the third application example. 

Figure 17 is a figure that shows the data structure and data examples in the external 
evaluation value buffer in the third application example. 

Explanation of reference symbols 

(1) ... external evaluation value and retrieval technique agreement determination module, (2) ...
external evaluation value totaling module, (3) ... relational judgment module, (4) ... constant
value generation module, (5) ... penalty calculation module, (6) ... average penalty calculation
module, (7) ... evaluation value buffer, (8) ... sorting module, (9) ... same number check module,
(21) ... working buffer, (22) ... comparison module, (23) ... division module, (24) ... penalty
value generation module, (31) ... working buffer, (32) ... number of people responding calculation
module, (33) ... entropy calculation module, (34) ... penalty value generation module, (41) ...
actual evaluation value calculation module, (42) ... imaginary evaluation value calculation module,
(43) ... intermediate buffer, (44) ... technique order calculation module, (71) ... confidence
interval calculation module, (72) ... sequencing module, (73) ... synthesized order value
calculation module, (74) ... sorting module.

Figure 1

Key: 1 External evaluation value and retrieval technique agreement determination module
     2 External evaluation value totaling module
     3 Relational judgment module
     4 Constant value generation module
     5 Penalty calculation module
     6 Average penalty calculation module
     7 Evaluation value buffer
     8 Sorting module
     10 Information retrieval technique retrieval result table
     11 External evaluation value table
     12 Technique number
     13 Evaluation value
Figure 2

Key: 1 Should be retrieved
     2 Recall ratio
     3 Judgment by human
     4 Wobble in human judgment
     5 Relevance factor
     6 Judgment by information retrieval technique
     7 Should not be retrieved
     8 Rate at which data that should not be retrieved are not retrieved

Figure 3

Key: 1 Retrieval technique name
     2 Retrieval technique number
     3 Retrieval conditions - data identifier
     4 Judgment flag (0: should not be retrieved, 1: should be retrieved)
     5 AND retrieval
     6 Information retrieval technique retrieval result table

Figure 4

Key: 1 Response of test subject 1
     2 Response of test subject 2
     3 Response of test subject 3
     4 Response of test subject 4
     5 Response of test subject 5
     6 Response of test subject 6
     7 Response of test subject 7
     8 Retrieval conditions - data identifier
     9 Judgment flag (0: should not be retrieved, 1: should be retrieved)
     10 External evaluation value table






Figure 5

Key: 1 Evaluation start
     2 End
     11 Count the number of people that think it should be retrieved and the number of
        people that think it should not be retrieved, for the data for each identifier in the
        external evaluation value table
     12 Determine whether it should be retrieved from the number of people that think it
        should be retrieved and the number of people that think it should not be retrieved
     13 Data in information retrieval technique retrieval result table stored in evaluation
        value storage buffer
     14 Retrieval technique result and relational judgment module result the same?
     15 Constant value is generated
     16 Penalty for one relation is calculated from the number of people that think it should
        be retrieved and the number of people that think it should not be retrieved
     17 Average value (evaluation value) of penalties calculated
     18 Evaluation value stored in evaluation value buffer
     18-1 All techniques input?
     19 Evaluation values rearranged in ascending order
     20 Retrieval technique number is output

Figure 6

Key: 1 Retrieval conditions - identifier
     2 Judgment flag produced by information retrieval technique retrieval result table
     3 Judgment flag produced by relational judgment module
     4 Number of people that think it should be retrieved
     5 Number of people that think it should not be retrieved
     6 External evaluation value buffer

Figure 7

Key: 1 Penalty calculation module
     21 Working buffer
     22 Comparison module
     23 Division module
     23a Numerator buffer
     23b Denominator buffer
     24 Penalty value generation module

Figure 8

Key: 1 Penalty calculation module
     2 Module end
     25 Number of people that think it should be retrieved and number of people that think
        it should not be retrieved stored in working buffer
     26 Two values compared, smaller value stored in numerator buffer and larger value
        stored in denominator buffer
     27 Division performed using values in numerator buffer and denominator buffer
     28 Result of division subtracted from 1

Figure 9

Key: 1 Penalty calculation module
     31 Working buffer
     32 Number of people responding calculation module
     33 Entropy calculation module
     34 Penalty value generation module

Figure 10

Key: 1 Penalty value calculation module
     2 Module end
     35 Number of people that think it should be retrieved and number of people that think
        it should not be retrieved are stored in working buffer
     36 Two values are added to calculate number of people responding
     37 Entropy calculated
     38 Result of entropy subtracted from 1

Figure 11

Key: 1 External evaluation value and retrieval technique agreement determination module 

2 External evaluation value totaling module 

3 Relational judgment module 

4 Constant value generation module 

5 Penalty calculation module 

6 Average penalty calculation module 

10 Information retrieval technique retrieval result table 

11 External evaluation value table 

12 Technique number 

13 Evaluation value 

14 Actual evaluation value 

15 Imaginary evaluation value 

41 Actual evaluation value calculation module 

42 Imaginary evaluation value calculation module 

43 Intermediate buffer 

44 Technique order calculation module 

Key: 1 Evaluation start 
2 End 

51 Number of people that think it should be retrieved and number of people that think 
it should not be retrieved counted for data for each identifier in external evaluation 
value table 

52 Determine whether it should be retrieved from the number of people that think it 
should be retrieved and number of people that think it should not be retrieved 

53 Data in information retrieval technique retrieval result table stored in evaluation 
value storage buffer. 

54 Result of retrieval technique and result of relational judgment module the same? 

55 Constant value generated 

56 One relational penalty calculated from the number of people that think it should be 
retrieved and number of people that think it should not be retrieved 

57 Average value (evaluation value) of penalties calculated 

58 Amount of information for one relation calculated from the number of people that 
think it should be retrieved and number of people that think it should not be 
retrieved, and the average value for all relations found 

59 Value on the majority side is decreased and added to the minority side to generate 
noise. 

60 Amount of information for one relation is calculated using the number of people 
with noise, and the average value for all relations is found. 

61 Evaluation value stored in evaluation value buffer 
61-1 All techniques input? 

62 Rearranged in order of smaller evaluation value 

Figure 13 

Key: 1 Lower limit 

2 Evaluation value 

3 Upper limit 

4 Technique number 

5 Lower limit value 

6 Lower limit order 

7 Evaluation value 

8 Evaluation value order 

9 Upper limit value 

10 Upper limit order 

11 Technique [A, B, etc.] 

12 Synthesized order value 

13 Technique order calculation module 

71 Confidence interval calculation module 

72a Arrangement and order determination part 

73 Synthesized order value calculation module 

74 Sorting module 

Figure 14 



Key: 1 Technique order calculation module 

2 End 

80 Processing done for all data evaluation values? 

81 Lower limit value and upper limit value calculated 

82 Stored in buffer in sequencing module 

83 Sequencing performed for lower limit value, upper limit value, and evaluation 
value 

84 Synthesized order calculated 

85 Output in order of smaller synthesized order value. 
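
Steps 83 to 85 above can be illustrated with a short sketch. The key does not say how the confidence limits are computed or how the three orders are combined, so this sketch takes the lower and upper limits as given inputs and assumes the synthesized order value is the sum of the three ranks. Both choices, and all names used, are hypothetical.

```python
def rank_ascending(values):
    """Return 1-based ranks, smaller value first (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def order_techniques(techniques):
    """techniques maps a name to (lower limit, evaluation value, upper limit)."""
    names = list(techniques)
    # Step 83: rank the lower limits, evaluation values and upper limits separately.
    lower_rank = rank_ascending([techniques[n][0] for n in names])
    value_rank = rank_ascending([techniques[n][1] for n in names])
    upper_rank = rank_ascending([techniques[n][2] for n in names])
    # Step 84: synthesized order value (sum of ranks is an assumption).
    synthesized = {
        n: lower_rank[i] + value_rank[i] + upper_rank[i]
        for i, n in enumerate(names)
    }
    # Step 85: output in order of smaller synthesized order value.
    return sorted(names, key=lambda n: synthesized[n]), synthesized


techniques = {
    "A": (0.10, 0.20, 0.30),
    "B": (0.05, 0.25, 0.45),
    "C": (0.30, 0.40, 0.50),
}
ordering, synthesized = order_techniques(techniques)
print(ordering)     # ['A', 'B', 'C']
print(synthesized)  # {'A': 4, 'B': 5, 'C': 9}
```

Combining ranks instead of raw values means a technique with a low evaluation value but a wide confidence interval can still be placed behind a more consistent one.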

Figure 15 

Key: 1 External evaluation value and retrieval technique agreement determination module 

2 External evaluation value totaling module 

3 Relational judgment module 

4 Constant value generation module 

5 Penalty calculation module 

6 Average penalty calculation module 

7 Evaluation value buffer 

8 Sorting module 

9 Same number check module 

10 Same number occurrence constant generation module 

11 Information retrieval technique retrieval result table 

12 External evaluation value table 

13 Technique number 

14 Evaluation value 

Figure 16 

Key: 1 Evaluation start 
2 End 

11 Number of people that think it should be retrieved and number of people that think 
it should not be retrieved counted for data for each identifier in external evaluation 
value table 

12 Determine whether it should be retrieved from the number of people that think it 
should be retrieved and number of people that think it should not be retrieved 

13 Data in information retrieval technique retrieval result table stored in evaluation 
value storage buffer 

15 Constant value is generated 

16 One relational penalty is calculated from the number of people that think it should 
be retrieved and the number of people that think it should not be retrieved 

17 Average value (evaluation value) of penalties calculated 

18 Evaluation value stored in evaluation value buffer 
18-1 All techniques input? 

19 Rearranged in order of smaller evaluation value 

20 Retrieval technique number is output 

91 Number of people that think it should be retrieved and number of people that think 
it should not be retrieved the same number? 

92 Decision that it is not possible to determine 

93 Result of retrieval technique and result of relational judgment module the same, or 
determination impossible? 
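
The flow listed in this key (steps 11 to 20 and 91 to 93) can be sketched end to end as follows. Several details are not given in the key and are assumptions here: the constant generated in step 15 is taken to be 0, it is applied when the technique agrees with the majority or when the vote is tied, and the penalty in step 16 is taken to be 1 minus the ratio of the smaller count to the larger. All function and variable names are hypothetical.

```python
AGREEMENT_CONSTANT = 0.0  # assumption: value of the constant in step 15


def majority_judgment(n_yes, n_no):
    """Steps 12, 91, 92: True/False by majority, None if the counts are equal."""
    if n_yes == n_no:
        return None  # not possible to determine
    return n_yes > n_no


def ratio_penalty(n_yes, n_no):
    """Step 16 (assumed form): 1 - smaller count / larger count."""
    return 1.0 - min(n_yes, n_no) / max(n_yes, n_no)


def evaluate_technique(results, votes):
    """Average penalty (evaluation value) of one technique, steps 13-17.

    results: identifier -> True if the technique retrieved the item
    votes:   identifier -> (number for retrieval, number against retrieval)
    """
    if not results:
        raise ValueError("no retrieval results to evaluate")
    penalties = []
    for identifier, flag in results.items():
        n_yes, n_no = votes[identifier]
        judgment = majority_judgment(n_yes, n_no)
        # Step 93: same result, or determination impossible?
        if judgment is None or judgment == flag:
            penalties.append(AGREEMENT_CONSTANT)
        else:
            penalties.append(ratio_penalty(n_yes, n_no))
    return sum(penalties) / len(penalties)


def rank_techniques(all_results, votes):
    """Steps 18-20: store every evaluation value, then sort ascending."""
    scores = {t: evaluate_technique(r, votes) for t, r in all_results.items()}
    return sorted(scores, key=scores.get), scores


votes = {"d1": (8, 2), "d2": (1, 9), "d3": (5, 5)}
all_results = {
    "A": {"d1": True, "d2": False, "d3": True},
    "B": {"d1": False, "d2": False, "d3": False},
}
ordering, scores = rank_techniques(all_results, votes)
print(ordering)  # ['A', 'B']
```

In this example technique B is penalized only for d1, where it contradicts an 8-to-2 majority; the tied item d3 costs neither technique anything.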

Figure 17 



Key: 1 Retrieval conditions - identifier 

2 Judgment flag produced by information retrieval technique retrieval result table 

3 Judgment flag produced by relational judgment module 

4 Number of people that think it should be retrieved 

5 Number of people that think it should not be retrieved 

6 External evaluation value buffer 



